Overview

Dataset statistics

Number of variables: 14
Number of observations: 891
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 85.4 KiB
Average record size in memory: 98.1 B

Variable types

Numeric: 9
Categorical: 5

Alerts

Name has a high cardinality: 891 distinct values (High cardinality)
Survived is highly correlated with Sex (High correlation)
Pclass is highly correlated with Fare (High correlation)
Sex is highly correlated with Survived and 1 other field (High correlation)
SibSp is highly correlated with SibPar (High correlation)
Parch is highly correlated with SibPar (High correlation)
Fare is highly correlated with Pclass (High correlation)
Title_Name is highly correlated with Sex (High correlation)
SibPar is highly correlated with SibSp and 1 other field (High correlation)
Survived is highly correlated with Sex (High correlation)
Pclass is highly correlated with Fare (High correlation)
Sex is highly correlated with Survived and 1 other field (High correlation)
SibSp is highly correlated with SibPar (High correlation)
Parch is highly correlated with SibPar (High correlation)
Ticket is highly correlated with Ticket-label (High correlation)
Fare is highly correlated with Pclass (High correlation)
Title_Name is highly correlated with Sex (High correlation)
Ticket-label is highly correlated with Ticket (High correlation)
SibPar is highly correlated with SibSp and 1 other field (High correlation)
Survived is highly correlated with Sex (High correlation)
Pclass is highly correlated with Fare (High correlation)
Sex is highly correlated with Survived and 1 other field (High correlation)
SibSp is highly correlated with SibPar (High correlation)
Parch is highly correlated with SibPar (High correlation)
Fare is highly correlated with Pclass (High correlation)
Title_Name is highly correlated with Sex (High correlation)
SibPar is highly correlated with SibSp and 1 other field (High correlation)
Survived is highly correlated with Sex (High correlation)
Sex is highly correlated with Survived (High correlation)
Survived is highly correlated with Sex and 1 other field (High correlation)
Pclass is highly correlated with Fare (High correlation)
Sex is highly correlated with Survived and 1 other field (High correlation)
Age is highly correlated with Title_Name (High correlation)
SibSp is highly correlated with Parch and 2 other fields (High correlation)
Parch is highly correlated with SibSp and 2 other fields (High correlation)
Ticket is highly correlated with Ticket-label (High correlation)
Fare is highly correlated with Pclass and 1 other field (High correlation)
Title_Name is highly correlated with Survived and 4 other fields (High correlation)
Ticket-label is highly correlated with Ticket (High correlation)
SibPar is highly correlated with SibSp and 2 other fields (High correlation)
PassengerId is uniformly distributed (Uniform)
Name is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
SibSp has 608 (68.2%) zeros (Zeros)
Parch has 678 (76.1%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)
Ticket-label has 28 (3.1%) zeros (Zeros)
SibPar has 749 (84.1%) zeros (Zeros)
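
The "Unique" and "Zeros" alerts in this list can be approximated with simple counts. The sketch below is only an illustration: the helper name `simple_alerts` and the `zeros_threshold` value are hypothetical and are not taken from pandas-profiling's configuration. The SibSp column is rebuilt from the value counts reported further down in this report.

```python
def simple_alerts(name, values, zeros_threshold=0.01):
    """Return alert messages for one column (hypothetical helper)."""
    n = len(values)
    alerts = []
    # Every value occurs exactly once.
    if len(set(values)) == n:
        alerts.append(f"{name} has unique values")
    # Share of zeros above an assumed threshold.
    zeros = sum(1 for v in values if v == 0)
    if zeros / n > zeros_threshold:
        alerts.append(f"{name} has {zeros} ({zeros / n:.1%}) zeros")
    return alerts

# SibSp rebuilt from the value counts in this report (891 observations).
sibsp = [0] * 608 + [1] * 209 + [2] * 28 + [3] * 16 + [4] * 18 + [5] * 5 + [8] * 7
sibsp_alerts = simple_alerts("SibSp", sibsp)
passenger_alerts = simple_alerts("PassengerId", list(range(1, 892)))
print(sibsp_alerts)
print(passenger_alerts)
```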

Reproduction

Analysis started: 2021-11-28 01:54:22.634463
Analysis finished: 2021-11-28 01:54:55.245317
Duration: 32.61 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

PassengerId
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445
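
The PassengerId quantiles can be checked by hand because the column is simply 1 to 891. The sketch below assumes linear interpolation between order statistics; that choice is an assumption about how the report computes quantiles, although it does reproduce the figures listed here. The function name `quantile` is illustrative.

```python
def quantile(sorted_values, q):
    """Quantile q in [0, 1] using linear interpolation between order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

passenger_ids = list(range(1, 892))  # 1..891, already sorted

summary = {
    "5th percentile": quantile(passenger_ids, 0.05),
    "Q1": quantile(passenger_ids, 0.25),
    "Median": quantile(passenger_ids, 0.50),
    "Q3": quantile(passenger_ids, 0.75),
    "95th percentile": quantile(passenger_ids, 0.95),
}
summary["Range"] = passenger_ids[-1] - passenger_ids[0]
summary["IQR"] = summary["Q3"] - summary["Q1"]

for label, value in summary.items():
    print(f"{label}: {value}")
```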

Descriptive statistics

Standard deviation: 257.353842
Coefficient of variation (CV): 0.5770265516
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
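
Several of the PassengerId descriptive statistics follow directly from the values 1 to 891. The sketch below uses Python's `statistics` module and assumes the sample (n - 1) definitions of variance and standard deviation, which is what yields the reported variance of 66231; the variable names are illustrative.

```python
import statistics

ids = list(range(1, 892))  # PassengerId runs from 1 to 891

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)   # sample variance (divides by n - 1)
std = statistics.stdev(ids)           # sample standard deviation
cv = std / mean                       # coefficient of variation
median = statistics.median(ids)
# Median absolute deviation: median of the distances to the median.
mad = statistics.median([abs(x - median) for x in ids])
# Strictly increasing: every value is larger than its predecessor.
strictly_increasing = all(a < b for a, b in zip(ids, ids[1:]))

print(total, mean, variance, round(std, 6), round(cv, 10), mad, strictly_increasing)
```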
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.1%
599 | 1 | 0.1%
588 | 1 | 0.1%
589 | 1 | 0.1%
590 | 1 | 0.1%
591 | 1 | 0.1%
592 | 1 | 0.1%
593 | 1 | 0.1%
594 | 1 | 0.1%
595 | 1 | 0.1%
Other values (881) | 881 | 98.9%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
891 | 1 | 0.1%
890 | 1 | 0.1%
889 | 1 | 0.1%
888 | 1 | 0.1%
887 | 1 | 0.1%
886 | 1 | 0.1%
885 | 1 | 0.1%
884 | 1 | 0.1%
883 | 1 | 0.1%
882 | 1 | 0.1%

Survived
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count
0 | 549
1 | 342

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Length

Histogram of lengths of the category

Pie chart


Pclass
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count
3 | 491
1 | 216
2 | 184

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Length

Histogram of lengths of the category

Pie chart


Name
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count
Giles, Mr. Frederick Edward | 1
Scanlan, Mr. James | 1
Hays, Miss. Margaret Bechstein | 1
Kassem, Mr. Fared | 1
Hickman, Mr. Lewis | 1
Other values (886) | 886

Length

Max length: 82
Median length: 25
Mean length: 26.96520763
Min length: 12

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 891
Unique (%): 100.0%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Common Values

Value | Count | Frequency (%)
Giles, Mr. Frederick Edward | 1 | 0.1%
Scanlan, Mr. James | 1 | 0.1%
Hays, Miss. Margaret Bechstein | 1 | 0.1%
Kassem, Mr. Fared | 1 | 0.1%
Hickman, Mr. Lewis | 1 | 0.1%
Goodwin, Mrs. Frederick (Augusta Tyler) | 1 | 0.1%
de Pelsmaeker, Mr. Alfons | 1 | 0.1%
Pinsky, Mrs. (Rosa) | 1 | 0.1%
Olsen, Mr. Ole Martin | 1 | 0.1%
Augustsson, Mr. Albert | 1 | 0.1%
Other values (881) | 881 | 98.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
mr | 521 | 14.4%
miss | 182 | 5.0%
mrs | 129 | 3.6%
william | 64 | 1.8%
john | 44 | 1.2%
master | 40 | 1.1%
henry | 35 | 1.0%
george | 24 | 0.7%
james | 24 | 0.7%
charles | 23 | 0.6%
Other values (1515) | 2538 | 70.0%


Sex
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count
1 | 577
0 | 314

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 577 | 64.8%
0 | 314 | 35.2%

Length

Histogram of lengths of the category

Pie chart


Age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 146
Distinct (%): 16.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.96197755
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5th percentile: 5
Q1: 21
Median: 29
Q3: 37
95th percentile: 54.5
Maximum: 80
Range: 79.58
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 13.70007462
Coefficient of variation (CV): 0.4572486779
Kurtosis: 0.3920995024
Mean: 29.96197755
Median Absolute Deviation (MAD): 8
Skewness: 0.3782158678
Sum: 26696.122
Variance: 187.6920447
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
24 | 30 | 3.4%
30 | 28 | 3.1%
22 | 27 | 3.0%
33 | 27 | 3.0%
28 | 26 | 2.9%
18 | 26 | 2.9%
19 | 25 | 2.8%
25 | 24 | 2.7%
21 | 24 | 2.7%
36 | 22 | 2.5%
Other values (136) | 632 | 70.9%

Value | Count | Frequency (%)
0.42 | 1 | 0.1%
0.67 | 1 | 0.1%
0.75 | 2 | 0.2%
0.83 | 2 | 0.2%
0.92 | 1 | 0.1%
1 | 7 | 0.8%
2 | 10 | 1.1%
3 | 6 | 0.7%
4 | 10 | 1.1%
4.534 | 2 | 0.2%

Value | Count | Frequency (%)
80 | 1 | 0.1%
74 | 1 | 0.1%
71 | 2 | 0.2%
70.5 | 1 | 0.1%
70 | 2 | 0.2%
66 | 1 | 0.1%
65 | 3 | 0.3%
64 | 2 | 0.2%
63 | 2 | 0.2%
62 | 4 | 0.4%

SibSp
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5230078563
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.102743432
Coefficient of variation (CV): 2.108464374
Kurtosis: 17.88041973
Mean: 0.5230078563
Median Absolute Deviation (MAD): 0
Skewness: 3.695351727
Sum: 466
Variance: 1.216043077
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
0 | 608 | 68.2%
1 | 209 | 23.5%
2 | 28 | 3.1%
4 | 18 | 2.0%
3 | 16 | 1.8%
8 | 7 | 0.8%
5 | 5 | 0.6%

Value | Count | Frequency (%)
0 | 608 | 68.2%
1 | 209 | 23.5%
2 | 28 | 3.1%
3 | 16 | 1.8%
4 | 18 | 2.0%
5 | 5 | 0.6%
8 | 7 | 0.8%

Value | Count | Frequency (%)
8 | 7 | 0.8%
5 | 5 | 0.6%
4 | 18 | 2.0%
3 | 16 | 1.8%
2 | 28 | 3.1%
1 | 209 | 23.5%
0 | 608 | 68.2%

Parch
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3815937149
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8060572211
Coefficient of variation (CV): 2.112344071
Kurtosis: 9.778125179
Mean: 0.3815937149
Median Absolute Deviation (MAD): 0
Skewness: 2.749117047
Sum: 340
Variance: 0.6497282437
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
0 | 678 | 76.1%
1 | 118 | 13.2%
2 | 80 | 9.0%
3 | 5 | 0.6%
5 | 5 | 0.6%
4 | 4 | 0.4%
6 | 1 | 0.1%

Value | Count | Frequency (%)
0 | 678 | 76.1%
1 | 118 | 13.2%
2 | 80 | 9.0%
3 | 5 | 0.6%
4 | 4 | 0.4%
5 | 5 | 0.6%
6 | 1 | 0.1%

Value | Count | Frequency (%)
6 | 1 | 0.1%
5 | 5 | 0.6%
4 | 4 | 0.4%
3 | 5 | 0.6%
2 | 80 | 9.0%
1 | 118 | 13.2%
0 | 678 | 76.1%

Ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 679
Distinct (%): 76.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 754682.7228
Minimum: 0
Maximum: 23101294
Zeros: 4
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 2343
Q1: 17421
Median: 113510
Q3: 347465
95th percentile: 521173.5
Maximum: 23101294
Range: 23101294
Interquartile range (IQR): 330044

Descriptive statistics

Standard deviation: 3424853.649
Coefficient of variation (CV): 4.538137082
Kurtosis: 38.00287833
Mean: 754682.7228
Median Absolute Deviation (MAD): 111196
Skewness: 6.259104267
Sum: 672422306
Variance: 1.172962252 × 10^13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2343 | 7 | 0.8%
1601 | 7 | 0.8%
347082 | 7 | 0.8%
3101295 | 6 | 0.7%
2144 | 6 | 0.7%
347088 | 6 | 0.7%
14879 | 5 | 0.6%
382652 | 5 | 0.6%
6608 | 4 | 0.4%
0 | 4 | 0.4%
Other values (669) | 834 | 93.6%

Value | Count | Frequency (%)
0 | 4 | 0.4%
3 | 2 | 0.2%
541 | 1 | 0.1%
693 | 1 | 0.1%
695 | 1 | 0.1%
751 | 2 | 0.2%
752 | 1 | 0.1%
1166 | 1 | 0.1%
1585 | 1 | 0.1%
1601 | 7 | 0.8%

Value | Count | Frequency (%)
23101294 | 1 | 0.1%
23101293 | 1 | 0.1%
23101292 | 1 | 0.1%
23101290 | 1 | 0.1%
23101289 | 1 | 0.1%
23101288 | 1 | 0.1%
23101287 | 1 | 0.1%
23101286 | 1 | 0.1%
23101285 | 1 | 0.1%
23101283 | 1 | 0.1%

Fare
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.20420797
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.6934286
Coefficient of variation (CV): 1.543072528
Kurtosis: 33.39814088
Mean: 32.20420797
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.78731652
Sum: 28693.9493
Variance: 2469.436846
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
8.05 | 43 | 4.8%
13 | 42 | 4.7%
7.8958 | 38 | 4.3%
7.75 | 34 | 3.8%
26 | 31 | 3.5%
10.5 | 24 | 2.7%
7.925 | 18 | 2.0%
7.775 | 16 | 1.8%
0 | 15 | 1.7%
26.55 | 15 | 1.7%
Other values (238) | 615 | 69.0%

Value | Count | Frequency (%)
0 | 15 | 1.7%
4.0125 | 1 | 0.1%
5 | 1 | 0.1%
6.2375 | 1 | 0.1%
6.4375 | 1 | 0.1%
6.45 | 1 | 0.1%
6.4958 | 2 | 0.2%
6.75 | 2 | 0.2%
6.8583 | 1 | 0.1%
6.95 | 1 | 0.1%

Value | Count | Frequency (%)
512.3292 | 3 | 0.3%
263 | 4 | 0.4%
262.375 | 2 | 0.2%
247.5208 | 2 | 0.2%
227.525 | 4 | 0.4%
221.7792 | 1 | 0.1%
211.5 | 1 | 0.1%
211.3375 | 3 | 0.3%
164.8667 | 2 | 0.2%
153.4625 | 3 | 0.3%

Embarked
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count
2.0 | 644
0.0 | 168
1.0 | 77
1.2 | 2

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 0.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value | Count | Frequency (%)
2.0 | 644 | 72.3%
0.0 | 168 | 18.9%
1.0 | 77 | 8.6%
1.2 | 2 | 0.2%

Length

Histogram of lengths of the category

Pie chart


Title_Name
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.843995511
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 3
95th percentile: 4
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.233117624
Coefficient of variation (CV): 0.668720513
Kurtosis: 3.853630611
Mean: 1.843995511
Median Absolute Deviation (MAD): 0
Skewness: 1.790787467
Sum: 1643
Variance: 1.520579074
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
1 | 517 | 58.0%
3 | 182 | 20.4%
2 | 125 | 14.0%
4 | 40 | 4.5%
7 | 14 | 1.6%
6 | 7 | 0.8%
5 | 6 | 0.7%

Value | Count | Frequency (%)
1 | 517 | 58.0%
2 | 125 | 14.0%
3 | 182 | 20.4%
4 | 40 | 4.5%
5 | 6 | 0.7%
6 | 7 | 0.8%
7 | 14 | 1.6%

Value | Count | Frequency (%)
7 | 14 | 1.6%
6 | 7 | 0.8%
5 | 6 | 0.7%
4 | 40 | 4.5%
3 | 182 | 20.4%
2 | 125 | 14.0%
1 | 517 | 58.0%

Ticket-label
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 30
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.693602694
Minimum: 0
Maximum: 29
Zeros: 28
Zeros (%): 3.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1019.0 B

Quantile statistics

Minimum: 0
5th percentile: 3
Q1: 9
Median: 9
Q3: 9
95th percentile: 24
Maximum: 29
Range: 29
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.864611205
Coefficient of variation (CV): 0.5018372796
Kurtosis: 5.708221254
Mean: 9.693602694
Median Absolute Deviation (MAD): 0
Skewness: 1.995259674
Sum: 8637
Variance: 23.66444217
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=30)

Value | Count | Frequency (%)
9 | 661 | 74.2%
10 | 60 | 6.7%
3 | 41 | 4.6%
0 | 28 | 3.1%
26 | 18 | 2.0%
24 | 15 | 1.7%
28 | 10 | 1.1%
18 | 7 | 0.8%
20 | 6 | 0.7%
2 | 5 | 0.6%
Other values (20) | 40 | 4.5%

Value | Count | Frequency (%)
0 | 28 | 3.1%
1 | 1 | 0.1%
2 | 5 | 0.6%
3 | 41 | 4.6%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 5 | 0.6%
7 | 1 | 0.1%
8 | 4 | 0.4%
9 | 661 | 74.2%

Value | Count | Frequency (%)
29 | 3 | 0.3%
28 | 10 | 1.1%
27 | 2 | 0.2%
26 | 18 | 2.0%
25 | 1 | 0.1%
24 | 15 | 1.7%
23 | 2 | 0.2%
22 | 3 | 0.3%
21 | 1 | 0.1%
20 | 6 | 0.7%

SibPar
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 10
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5679012346
Minimum: 0
Maximum: 16
Zeros: 749
Zeros (%): 84.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 4
Maximum: 16
Range: 16
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.979286552
Coefficient of variation (CV): 3.485265449
Kurtosis: 32.05865416
Mean: 0.5679012346
Median Absolute Deviation (MAD): 0
Skewness: 5.236253299
Sum: 506
Variance: 3.917575253
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 749 | 84.1%
1 | 57 | 6.4%
2 | 26 | 2.9%
4 | 16 | 1.8%
3 | 10 | 1.1%
6 | 9 | 1.0%
8 | 9 | 1.0%
16 | 7 | 0.8%
10 | 5 | 0.6%
5 | 3 | 0.3%

Value | Count | Frequency (%)
0 | 749 | 84.1%
1 | 57 | 6.4%
2 | 26 | 2.9%
3 | 10 | 1.1%
4 | 16 | 1.8%
5 | 3 | 0.3%
6 | 9 | 1.0%
8 | 9 | 1.0%
10 | 5 | 0.6%
16 | 7 | 0.8%

Value | Count | Frequency (%)
16 | 7 | 0.8%
10 | 5 | 0.6%
8 | 9 | 1.0%
6 | 9 | 1.0%
5 | 3 | 0.3%
4 | 16 | 1.8%
3 | 10 | 1.1%
2 | 26 | 2.9%
1 | 57 | 6.4%
0 | 749 | 84.1%

Interactions

[81 interaction plots not reproduced here.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
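
As a rough illustration of that recipe, the sketch below ranks both variables (ties share the average rank) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson`, and `spearman` and the toy data are illustrative, not part of the report.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear relationship
rho = spearman(x, y)
r = pearson(x, y)
print(rho, r)
```

On this toy data ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.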

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
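
A minimal sketch of this formula follows, using the Age and Fare values of the first five sample rows of this report as toy input. It also checks the invariance under shifts and positive rescaling mentioned above; the function name `pearson` and the chosen shift and scale constants are illustrative.

```python
import math

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/(n - 1) factors of covariance and standard deviations cancel out.
    return cov / (spread_x * spread_y)

ages = [22.0, 38.0, 26.0, 35.0, 35.0]
fares = [7.25, 71.2833, 7.925, 53.1, 8.05]

r = pearson(ages, fares)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson([2 * a + 3 for a in ages], [0.5 * f - 7 for f in fares])
# A negative scale factor flips the sign but keeps the magnitude.
r_flipped = pearson(ages, [-f for f in fares])
print(r, r_rescaled, r_flipped)
```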

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
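
The pair-counting definition can be sketched directly. Note that this is the plain variant without a tie correction (often called tau-a); libraries commonly apply a tie-corrected variant, so results on tied data may differ. The function name `kendall_tau_a` and the toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(tau)
```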

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
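
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table is given below. It follows the commonly cited form of Bergsma's correction and is not taken from the pandas-profiling source; the function name `cramers_v_corrected` and the two toy tables are hypothetical. It assumes every row and column total is non-zero.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    r, k = len(table), len(table[0])
    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corrected / min(k_corrected - 1, r_corrected - 1))

perfect = [[10, 0], [0, 10]]      # each row category maps to one column category
independent = [[5, 5], [5, 5]]    # no association at all
v_perfect = cramers_v_corrected(perfect)
v_independent = cramers_v_corrected(independent)
print(v_perfect, v_independent)
```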

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

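Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Embarked | Title_Name | Ticket-label | SibPar
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | 1 | 22.0 | 1 | 0 | 521171 | 7.2500 | 2.0 | 1 | 0 | 0
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | 0 | 38.0 | 1 | 0 | 17599 | 71.2833 | 0.0 | 2 | 10 | 0
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | 0 | 26.0 | 0 | 0 | 23101282 | 7.9250 | 2.0 | 3 | 26 | 0
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | 0 | 35.0 | 1 | 0 | 113803 | 53.1000 | 2.0 | 2 | 9 | 0
4 | 5 | 0 | 3 | Allen, Mr. William Henry | 1 | 35.0 | 0 | 0 | 373450 | 8.0500 | 2.0 | 1 | 9 | 0
5 | 6 | 0 | 3 | Moran, Mr. James | 1 | 25.0 | 0 | 0 | 330877 | 8.4583 | 1.0 | 1 | 9 | 0
6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | 1 | 54.0 | 0 | 0 | 17463 | 51.8625 | 2.0 | 1 | 9 | 0
7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | 1 | 2.0 | 3 | 1 | 349909 | 21.0750 | 2.0 | 4 | 9 | 3
8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | 0 | 27.0 | 0 | 2 | 347742 | 11.1333 | 2.0 | 2 | 9 | 0
9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | 0 | 14.0 | 1 | 0 | 237736 | 30.0708 | 0.0 | 2 | 9 | 0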

Last rows

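Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Embarked | Title_Name | Ticket-label | SibPar
881 | 882 | 0 | 3 | Markun, Mr. Johann | 1 | 33.0 | 0 | 0 | 349257 | 7.8958 | 2.0 | 1 | 9 | 0
882 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | 0 | 22.0 | 0 | 0 | 7552 | 10.5167 | 2.0 | 3 | 9 | 0
883 | 884 | 0 | 2 | Banfield, Mr. Frederick James | 1 | 28.0 | 0 | 0 | 34068 | 10.5000 | 2.0 | 1 | 4 | 0
884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | 1 | 25.0 | 0 | 0 | 392076 | 7.0500 | 2.0 | 1 | 24 | 0
885 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | 0 | 39.0 | 0 | 5 | 382652 | 29.1250 | 1.0 | 2 | 9 | 0
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | 1 | 27.0 | 0 | 0 | 211536 | 13.0000 | 2.0 | 5 | 9 | 0
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | 0 | 19.0 | 0 | 0 | 112053 | 30.0000 | 2.0 | 3 | 9 | 0
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | 0 | 13.2 | 1 | 2 | 6607 | 23.4500 | 2.0 | 3 | 28 | 2
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | 1 | 26.0 | 0 | 0 | 111369 | 30.0000 | 0.0 | 1 | 9 | 0
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | 1 | 32.0 | 0 | 0 | 370376 | 7.7500 | 1.0 | 1 | 9 | 0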